Statistical Linearization for Value Function Approximation in Reinforcement Learning
Author
Abstract
Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important RL subtopic is approximating this function when the system is too large for an exact representation. This paper presents statistical-linearization-based approaches for estimating such functions. Compared to more classical approaches, this allows considering nonlinear parameterizations as well as the Bellman optimality operator, which induces some differentiability problems. Moreover, the statistical point of view adopted here allows considering colored observation-noise models instead of the classical white one; in RL, this can prove useful.
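The statistical linearization the abstract refers to can be illustrated with a sigma-point (unscented-transform) construction: a nonlinear function is replaced by its best linear fit under a Gaussian distribution over its input. The sketch below is only an illustration of this general technique under assumed weights and function names, not the paper's actual algorithm.

```python
import numpy as np

def sigma_points(mean, cov, kappa=1.0):
    """Generate the 2n+1 standard sigma points for N(mean, cov)."""
    n = mean.size
    S = np.linalg.cholesky((n + kappa) * cov)
    pts = [mean]
    for i in range(n):
        pts.append(mean + S[:, i])
        pts.append(mean - S[:, i])
    w = np.full(2 * n + 1, 1.0 / (2 * (n + kappa)))
    w[0] = kappa / (n + kappa)
    return np.array(pts), w

def statistical_linearization(f, mean, cov):
    """Best linear fit f(x) ~ A x + b with respect to N(mean, cov):
    A = P_yx P_xx^{-1}, b = E[f(x)] - A mean, moments estimated
    from the sigma points."""
    pts, w = sigma_points(mean, cov)
    ys = np.array([f(p) for p in pts])
    y_mean = w @ ys
    # Cross-covariance between input and output of f
    P_xy = ((pts - mean) * w[:, None]).T @ (ys - y_mean)
    A = np.linalg.solve(cov, P_xy).T
    b = y_mean - A @ mean
    return A, b
```

For a function that is already linear, this construction recovers the function exactly, which is a convenient sanity check; for a nonlinear value-function parameterization it yields the locally best linear surrogate, sidestepping the differentiability issues the abstract mentions.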
Similar resources
Function Approximation in Hierarchical Relational Reinforcement Learning
Recently, a number of different approaches have been developed for hierarchical reinforcement learning in the propositional setting. We propose a hierarchical version of relational reinforcement learning (HRRL). We describe a value function approximation method inspired by logic programming which is suitable for HRRL.
Active Policy Iteration: Efficient Exploration through Active Learning for Value Function Approximation in Reinforcement Learning
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares policy iteration (LSPI) framework allows us to employ statistical active learning methods for linear regression. Then we propose a design method of good sampling policies for efficient exploration, which is particularl...
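The policy-evaluation step inside the LSPI framework mentioned above reduces to solving a linear system, which is what makes linear-regression-style active learning applicable. The following LSTD(0) sketch shows that step under assumed names and a small regularizer; it is a generic illustration, not the cited paper's implementation.

```python
import numpy as np

def lstd(transitions, phi, gamma=0.95, reg=1e-6):
    """Least-squares temporal-difference policy evaluation:
    solve A w = b with A = sum_i phi(s_i) (phi(s_i) - gamma phi(s'_i))^T
    and b = sum_i r_i phi(s_i)."""
    d = phi(transitions[0][0]).size
    A = reg * np.eye(d)   # small ridge term for invertibility
    b = np.zeros(d)
    for s, r, s_next in transitions:
        f = phi(s)
        A += np.outer(f, f - gamma * phi(s_next))
        b += r * f
    return np.linalg.solve(A, b)
```

With one-hot features and one sample per transition of a deterministic chain, the solution coincides with the exact fixed point of the Bellman evaluation equations, up to the tiny regularization.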
Bayesian Reinforcement Learning with Gaussian Process Temporal Difference Methods
Reinforcement Learning is a class of problems frequently encountered by both biological and artificial agents. An important algorithmic component of many Reinforcement Learning solution methods is the estimation of state or state-action values of a fixed policy controlling a Markov decision process (MDP), a task known as policy evaluation. We present a novel Bayesian approach to policy evaluati...
A Brief Survey of Parametric Value Function Approximation
Reinforcement learning is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important subtopic of reinforcement learning is to compute an approximation of this value function when the system is too large ...
Model-based reinforcement learning using on-line clustering
A significant issue in representing reinforcement learning agents in Markov decision processes is how to design efficient feature spaces in order to estimate the optimal policy. This study addresses the challenge by proposing a compact framework that employs an on-line clustering approach for building appropriate basis functions. It also performs a state-action trajectory analysis to gai...
Journal:
Volume, Issue
Pages: -
Publication year: 2010